
Hadoop Developer - San Jose, CA

Job Description

  • Lead the architecture, design, and development of components and services to enable Machine Learning at scale.
  • Responsible for data ingestion from disparate data sources, including SAP and non-SAP sources such as iEnergy and MDMS, and for maintaining and extending the big data platform infrastructure that supports the client’s business use cases.
  • Identify and recommend the most appropriate paradigms and technology choices for batch and real-time scenarios.
  • Work on cluster-level solutions for our complex systems and develop enterprise-level applications, followed by unit testing.
  • Build pipelines from source to SFTP, and from SFTP to the Hadoop landing layer, using Talend.
  • Develop an automated data ingestion framework using Talend to synchronize Hadoop data with SAP HANA and vice versa.
  • Run complex queries and work with bucketing, partitioning, joins, and sub-queries.
  • Write advanced big data business application code using both functional and object-oriented programming.
  • Implement complex transformations and actions using DataFrames and Datasets in Spark/Scala.
  • Develop standalone Spark/Scala applications that read error logs from multiple upstream data sources and run validations on them.
  • Write build scripts using tools such as Apache Maven, Ant, and sbt, and deploy the code using Jenkins for CI/CD.
  • Write complex workflow jobs using Redwood and set up a multi-program scheduling system to manage Hadoop, Hive, Sqoop, and Spark jobs.
  • Closely monitor pipeline jobs and troubleshoot failed jobs. Set up new property configurations within Redwood SC.
  • Develop Kafka producers that handle multiple streaming data feeds within a specified duration.
  • Teach and mentor other engineers on the team.
  • Document the functional and technical requirements by following company defined processes and methodologies.
  • Perform data cleanup and validation on streaming data using Spark, Spark Streaming, and Scala.

Required Skills:

  • A minimum of a bachelor's degree in computer science or equivalent.
  • Cloudera Hadoop (CDH), Cloudera Manager, Informatica Big Data Edition (BDM), HDFS, YARN, MapReduce, Hive, Impala, Kudu, Sqoop, Spark, Kafka, HBase, Teradata Studio Express, Teradata, Tableau, Kerberos, Active Directory, Sentry, TLS/SSL, Linux/RHEL, Unix, Windows, SBT, Maven, Jenkins, Oracle, MS SQL Server, Shell Scripting, Eclipse IDE, Git, SVN
  • Must have strong problem-solving and analytical skills.
  • Must have the ability to identify complex problems and review related information to develop and evaluate options and implement solutions.

If you are interested in working in a fast-paced, challenging, fun, entrepreneurial environment and would like the opportunity to be part of this fascinating industry, send your resume to HSTechnologies LLC, 2801 W Parker Road, Suite #5, Plano, TX 75023, or email it to hr@sbhstech.com.